1. INTRODUCTION

Most content-based image retrieval (CBIR) systems are designed primarily for accuracy. More
recent approaches combine several CBIR techniques into a single system [1]. For instance, in 2016
K. Mala, A. Anandh, and S. Suganya proposed a CBIR method that combines several feature extraction
approaches in order to increase image retrieval accuracy [2], [3]. Their system combines the Gabor
wavelet feature, the color auto-correlogram feature, and wavelet transform features [2], and the
reported results show an improvement in accuracy [3]. However, as more feature extraction methods
are combined to improve accuracy, the computing cost rises and retrieval becomes slower [3]. It is
therefore critical to improve system performance as well as system accuracy [4], [5].

In this work, the database was changed in order to improve retrieval speed. As an
intermediate step, part of the image comparison is completed when a picture is placed in the
database. With the suggested technique, the findings demonstrate a considerable performance gain
[3]. The rest of this report is organized as follows. Section 1 (Introduction) covers content-based
image retrieval, the stated task, the suggested solution, and a preliminary overview of the report.
Section 2 (Method) presents the content-based image retrieval approach: the comparison between the
red, green and blue (RGB) model and the hue, saturation and value (HSV) model, the conversion from
the RGB model to the HSV model, and the image comparison and image retrieval methods. It also
deals in detail with the database structure, the algorithm, and the system implementation. Because
retrieving images takes an excessive amount of time, the system's performance has to be improved;
changes to the database structure, together with the image retrieval method, allowed the system to
run faster. After the measurements were taken, the findings were assessed and plotted in order to
select the best distance measure for retrieving images. The proposed system was also compared with
the prior version of the content-based image retrieval system. Section 3 (Results and Discussion)
contains the analysis, the outcomes, and the discussion of this comparison.


2. METHOD 

This study aims to build an efficient content-based image retrieval system. It describes a way
to store histograms in a database, which greatly increases the speed at which images can be
retrieved [5]. At least three studies on image-matching methods for content-based image retrieval
have previously been completed [6], [7]. Three distance measures were used to build the
content-based image retrieval system, and the three were then compared in order to identify the
best distance measure for the system [8]. The procedure for comparing image retrieval under these
distance measures is shown in Figure 1 [9].


[Figure 1: flowchart. Query image -> Convert RGB color model to HSV -> Quantize each pixel into
bins -> Calculate the similarity metric between query ... (Normalized Cross Correlation |
Histogram Intersection | Euclidean Distance) -> Retrieve similar images -> Accuracy and
performance evaluation]

Figure 1. Comparison of image retrieval and distance measures


2.1. Image comparison 

Images are originally in the RGB color model. Because RGB values change proportionately
with the amount of light, content-based image retrieval algorithms that work on RGB values may not
always give correct results [10], [11]. Here, both the user's query image and the images in the
database are therefore converted to the HSV model. Every pixel of the HSV image is quantized into
bins in order to generate the histogram [12]. The larger the number of bins, the more precise the
results [13]; on the other hand, more bins slow the process down. Researchers who aim for faster
results have built systems that quantize pixels into a small number of bins, and almost all of them
used 36 bins [14], [15]. Here, by contrast, the goal is accuracy, so pixels are quantized into
10 x 4 x 4 = 160 bins: hue is divided into 10 parts, saturation into four parts, and value into
four parts [16], [17]. Figure 2 illustrates the division of the HSV color space into bins [18].
Figure 2. Quantization of HSV color model pixels into bins
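
The quantization step described above can be sketched as follows. This is a minimal illustration in Python rather than the C# of the paper's own implementation. The paper does not state the bin boundaries, so equal-width intervals on hue, saturation, and value (each scaled to [0, 1]) are an assumption, and the names `quantize_pixel` and `hsv_histogram` are hypothetical.

```python
import colorsys

H_BINS, S_BINS, V_BINS = 10, 4, 4  # 10 x 4 x 4 = 160 bins, as in the text


def quantize_pixel(r, g, b):
    """Map one 8-bit RGB pixel to a flat bin index in [0, 160)."""
    h, s, v = colorsys.rgb_to_hsv(r / 255.0, g / 255.0, b / 255.0)
    # Assumed uniform bins; a value of exactly 1.0 is clamped into the last bin.
    hi = min(int(h * H_BINS), H_BINS - 1)
    si = min(int(s * S_BINS), S_BINS - 1)
    vi = min(int(v * V_BINS), V_BINS - 1)
    return (hi * S_BINS + si) * V_BINS + vi


def hsv_histogram(pixels):
    """Normalized 160-bin histogram of an iterable of (r, g, b) tuples."""
    hist = [0.0] * (H_BINS * S_BINS * V_BINS)
    n = 0
    for r, g, b in pixels:
        hist[quantize_pixel(r, g, b)] += 1.0
        n += 1
    # Dividing by the pixel count makes images of different sizes comparable.
    return [c / n for c in hist] if n else hist
```

Normalizing by the pixel count is a design choice of this sketch; the paper does not say whether its histograms are normalized.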


Once the pixel values have been quantized into bins, a similarity metric is computed between
the query image and each image in the database [19]. Several distance metrics were used in this
system: Euclidean distance, normalized cross-correlation distance, and histogram intersection
distance [20].
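
The three measures can be illustrated with the short Python sketch below. The paper does not give its formulas, so the definitions used here are common textbook forms and should be read as assumptions: the intersection distance is taken as one minus the overlap divided by the smaller histogram mass, and the normalized cross-correlation distance as one minus the correlation without mean removal. All function names are hypothetical.

```python
import math


def euclidean_distance(a, b):
    """Square root of the summed squared bin differences."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def intersection_distance(a, b):
    """Assumed form: 1 - sum(min(a_i, b_i)) / min(sum(a), sum(b))."""
    denom = min(sum(a), sum(b))
    if denom == 0:
        return 1.0  # an empty histogram overlaps with nothing
    return 1.0 - sum(min(x, y) for x, y in zip(a, b)) / denom


def ncc_distance(a, b):
    """Assumed form: 1 - sum(a_i * b_i) / sqrt(sum(a_i^2) * sum(b_i^2))."""
    norm = math.sqrt(sum(x * x for x in a) * sum(y * y for y in b))
    if norm == 0:
        return 1.0
    return 1.0 - sum(x * y for x, y in zip(a, b)) / norm
```

With these definitions all three functions return 0 for identical histograms and larger values for less similar ones, which matches the convention in the text that similar images have lower values.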


2.2. Image Retrieval 

To find images similar to the one the user submitted, the query image is obtained from the user
and compared with all of the images in the database [21]. Comparing against every image is costly,
and this report proposes a remedy for that. After the system has analyzed the image supplied by
the user, it connects to the image database. The database holds an identification number and a
path for each image [22], and the array of distances is cross-referenced with the image table
through the image ID, as shown in Figure 3.


[Figure 3: diagram linking the image table and the array of distances through Image ID]


Figure 3. Cross-referencing of the image table and array of distance


2.3. Analysis of results 

Every distance measure studied in this section was evaluated using the retrieved images. A
scatter plot was used to show how the distances are distributed over the images for each distance
measure. The range of the distance values depends on the distance measure used, so the distances
have to be standardized before the measures can be compared. Min-max normalization was used for
this standardization.


\[
\text{newValue} = \frac{x - \min_x}{\max_x - \min_x} \times (D - C) + C \qquad (1)
\]

Here x is a distance value, \min_x and \max_x are the smallest and largest distance values for the
measure, and [C, D] is the predetermined target range. In this system, C is assigned a value of
zero and D a value of one hundred, so the normalized value can be read as a percentage.
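
As a small worked illustration of this normalization, the Python sketch below rescales a list of distances to the range [C, D] = [0, 100]. The function name `min_max_normalize` and the handling of the degenerate case where all distances are equal are assumptions of the sketch, not details from the paper.

```python
def min_max_normalize(values, c=0.0, d=100.0):
    """Rescale values linearly so that min -> c and max -> d."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # All distances equal: the formula would divide by zero,
        # so map everything to the lower bound (an assumed convention).
        return [c for _ in values]
    return [(x - lo) / (hi - lo) * (d - c) + c for x in values]


# Distances 2, 4, 6 become 0 %, 50 %, 100 % of the observed range.
print(min_max_normalize([2.0, 4.0, 6.0]))  # [0.0, 50.0, 100.0]
```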


2.4. Implementation 

The software was written in the C# programming language, and the MySQL database management
system was used to set up the database. To test the image comparison approach, an image database
of only 27 pictures was built, as shown in Figure 4.

a. Database

A database is an organized collection of data that is stored and accessed electronically. Image
databases are widely used. The database diagram depicts the structure of the first iteration of
the content-based image retrieval system.

Figure 4. Image database


b. Algorithm 

Algorithm 1 connects to the database and computes a similarity metric for the images stored in
it, using the histogram intersection distance. The user's image is taken as the query image, and
its distance to each image in the database is computed.


Algorithm 1. Connect to database
Connect to the database
Get all records in the Image table into a Reader object
Generate the histogram of the user-entered image (histogram values stored in array1)
While (not last image in the database)
    Store the current Image ID and Image Path in the 2D array "arrdb"
    Generate the histogram of the image retrieved from the database (histogram values stored in
    array2)
    Calculate similarity using the Intersection Distance measure
    Store the result in the dist (distance) array with the Image ID
    Retrieve the next record from the database
Sort the dist (distance) array elements
Retrieve the smallest 9 distances
Retrieve the images from the folder corresponding to those distances using the Image paths
Display the retrieved images
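
The ranking part of Algorithm 1 can be sketched compactly in Python (the paper's implementation is in C#). The sketch assumes each database record is already available as an (image ID, image path, histogram) tuple and uses an assumed form of the intersection distance; `retrieve_top_k` and `intersection_distance` are hypothetical names.

```python
def intersection_distance(a, b):
    """Assumed form: 1 - sum(min(a_i, b_i)) / min(sum(a), sum(b))."""
    denom = min(sum(a), sum(b))
    if denom == 0:
        return 1.0
    return 1.0 - sum(min(x, y) for x, y in zip(a, b)) / denom


def retrieve_top_k(query_hist, records, k=9):
    """Return the (image_id, path) pairs of the k closest records."""
    scored = [
        (intersection_distance(query_hist, hist), image_id, path)
        for image_id, path, hist in records
    ]
    scored.sort(key=lambda item: item[0])  # smallest distance first
    return [(image_id, path) for _, image_id, path in scored[:k]]
```

Sorting the distance together with its image ID and path avoids a second lookup by distance value, which could pick the same image twice when two images happen to have equal distances.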


c. Code: variable declarations 

Algorithm 2 lists the variable declarations. A variable is a name given to a storage area that
the program can manipulate. Each variable has a specific type, which determines the size and
layout of the variable's memory, the range of values that can be stored within that memory, and
the set of operations that can be applied to the variable.

Bitmap newBitmap, newBitmap2;             // query image and current database image
public static int no_images = 127;        // number of images in the database
double r, g, b;                           // RGB components of a pixel
double h, s, v;                           // HSV components of a pixel
double temp, min, sum;
int hh, ss, vv;                           // quantized bin indices
int[,,] count = new int[10, 4, 4];        // 10 x 4 x 4 bin counts
// int[,,] count1 = new int[10, 4, 4];
// int[,,] count2 = new int[10, 4, 4];
double[] array1 = new double[10 * 4 * 4]; // histogram of the query image
double[] array2 = new double[10 * 4 * 4]; // histogram of a database image
string[] array3 = new string[10 * 4 * 4];
double[] array4 = new double[10 * 4 * 4];
// double[,] mtx;
// int arrlen;
double distance;
double[,] dist = new double[2, no_images];   // row 0: image ID, row 1: distance
double[] search = new double[no_images];
double[] distSort = new double[no_images];   // distances, to be sorted
// int[] numbers = new int[5] { 1, 5, 2, 4, 3 };
int[] arrid = new int[9];                    // IDs of the 9 closest images
String[,] arrdb = new String[2, no_images];  // row 0: image ID, row 1: image path
String[] paths = new String[9];              // paths of the 9 closest images
String[] histDb = new String[no_images];
String hist;

Algorithms 3 and 4 give the code that computes the histograms and retrieves images using the
histogram intersection distance. This measure is used because it separates similar images from
different images more clearly than the other measures.


Algorithm 3. Syntax to server

private void button4_Click(object sender, EventArgs e)
{
    String connstring = "server=localhost;database=dbcbir2;uid=root";
    MySqlConnection conn = new MySqlConnection(connstring);
    MySqlCommand command = conn.CreateCommand();
    command.CommandText = "select * from images;";

    try
    {
        conn.Open();
    }
    catch (Exception ex)
    {
        Console.WriteLine(ex.Message);
    }

    MySqlDataReader reader = command.ExecuteReader();
    int cnt = 0;
    form3 c = new form3();

    // Histogram of the query image.
    count = c.pro(newBitmap);
    array1 = chist(count);

    // Reset the bin counts before the next image is processed.
    for (int i = 0; i < 10; i++)
        for (int j = 0; j < 4; j++)
            for (int k = 0; k < 4; k++)
            {
                count[i, j, k] = 0;
            }

    int num = 0;
    while (reader.Read())
    {
        // Remember the ID and path of the current database image.
        arrdb[0, cnt] = reader["imgid"].ToString();
        arrdb[1, cnt] = reader["imgpath"].ToString();

Building an efficient content based image retrieval system by changing ... (Rana Jassim Mohammed) 


1822 O ISSN: 2502-4752 


Algorithm 4. Syntax to server (continued)

        if (cnt == num)
        {
            // Histogram of the current database image.
            file = Image.FromFile(reader["imgpath"].ToString());
            newBitmap2 = new Bitmap(reader["imgpath"].ToString());
            count = c.pro(newBitmap2);
            array2 = chist(count);

            for (int i = 0; i < 10; i++)
                for (int j = 0; j < 4; j++)
                    for (int k = 0; k < 4; k++)
                    {
                        count[i, j, k] = 0;
                    }

            // Histogram intersection distance between query and database image.
            Intr id = new Intr();
            double distance = id.intrdist(array1, array2);
            dist[0, cnt] = cnt + 1;   // image ID, assumed 1-based
            dist[1, cnt] = Math.Round(distance, 6);
            dist[1, cnt] = Math.Abs(dist[1, cnt]);
            distSort[cnt] = Math.Round(distance, 6);
            distSort[cnt] = Math.Abs(distSort[cnt]);
            search[cnt] = Math.Round(distance, 6);
            search[cnt] = Math.Abs(search[cnt]);
        }
        cnt++;
        num++;
    }

    // Find the image IDs that belong to the 9 smallest distances.
    Array.Sort(distSort);
    int index = distSort.Length;
    for (int k = 0; k < 9; k++)
    {
        for (int i = 0; i < index; i++)
        {
            if (dist[1, i] == distSort[k])
            {
                arrid[k] = (int)dist[0, i];
                break;
            }
        }
    }
    conn.Close();

    // Look up the path of each selected image ID.
    for (int k = 0; k < 9; k++)
    {
        for (int i = 0; i < index; i++)
        {
            if (arrdb[0, i] == arrid[k].ToString())
            {
                paths[k] = arrdb[1, i];
                break;
            }
        }
    }

    // Display the 9 retrieved images.
    pictureBox2.Image = Image.FromFile(paths[0]);
    pictureBox3.Image = Image.FromFile(paths[1]);
    pictureBox4.Image = Image.FromFile(paths[2]);
    pictureBox5.Image = Image.FromFile(paths[3]);
    pictureBox6.Image = Image.FromFile(paths[4]);
    pictureBox7.Image = Image.FromFile(paths[5]);
    pictureBox8.Image = Image.FromFile(paths[6]);
    pictureBox9.Image = Image.FromFile(paths[7]);
    pictureBox10.Image = Image.FromFile(paths[8]);
}

2.5. Drawback of current system 

In the current system, images take a substantial amount of time to be retrieved. The system
compares every image the user uploads with all of the existing images in the database [23].
Because each comparison involves several image processing steps [24] and databases typically
contain many photos, retrieval is slow.


2.6. Current system’s performance improvement

Instead of complete images only, an interim result is kept in the database: the database stores
the quantized bin values of every image [25]. Since about half of the retrieval process is thereby
completed in advance, the system only has to compute the bin values of the input image and compare
them with the stored values to finish the task [18]. The idea for increasing the overall system
performance is presented in Figure 5 [14].


[Figure 5: flowchart. Query image -> Convert RGB color model to HSV -> Quantize each pixel into
bins -> Calculate the similarity metric between query ... (Normalized Cross Correlation |
Histogram Intersection | Euclidean Distance) -> Retrieve similar images -> Accuracy and
performance evaluation]


Figure 5. Proposed method to retrieve images
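
The idea of keeping the quantized bin values in the database can be sketched as follows. This Python illustration uses the standard-library `sqlite3` module in memory as a stand-in for the MySQL database of the paper, and stores each histogram as JSON text. The `hist` column, the JSON format, and the function names are assumptions; only the `images` table and its `imgid` and `imgpath` columns appear in the paper's listing.

```python
import json
import sqlite3


def create_db():
    """In-memory stand-in for the image database (assumed schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE images (imgid INTEGER PRIMARY KEY, imgpath TEXT, hist TEXT)"
    )
    return conn


def add_image(conn, imgid, imgpath, hist):
    # The histogram is computed once, when the image is added,
    # and stored next to the path as JSON text.
    conn.execute(
        "INSERT INTO images (imgid, imgpath, hist) VALUES (?, ?, ?)",
        (imgid, imgpath, json.dumps(hist)),
    )


def load_histograms(conn):
    """Yield (imgid, imgpath, histogram) without opening any image file."""
    rows = conn.execute("SELECT imgid, imgpath, hist FROM images ORDER BY imgid")
    for imgid, imgpath, hist in rows:
        yield imgid, imgpath, json.loads(hist)
```

At query time only the query image needs to be converted and quantized; the stored histograms are read back and passed straight to the distance measure, which is where the saving in retrieval time would come from.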


3. RESULTS AND DISCUSSION 

Any of the three metrics can be used, in the sense that each of them gives a lower value for
similar images than for dissimilar ones. Nevertheless, the distribution of the values differs
between the techniques. In contrast to the normalized cross-correlation distance measure, the
Euclidean distance and the histogram intersection distance show a larger gap between the values
for similar images and those for different images, so these two measures are judged superior. The
gap is considerably larger for the histogram intersection distance than for the Euclidean
distance. Thus, the histogram intersection distance is the most appropriate distance measure for
content-based image retrieval in this study.

As stated above, the histogram intersection distance appears to be the best distance metric
for this study. It is also faster than the normalized cross-correlation technique when performance
is considered. The second finding of this research is that the performance of a content-based
image retrieval system can be improved by storing intermediate results in the database rather than
working from the complete images each time.

The system built with this content-based image retrieval algorithm is color-based. Its accuracy
may be increased by incorporating other visual characteristics, such as texture and shape, and
color features can be combined with other feature extraction methods in the future. Furthermore,
the method could be useful to content-based image retrieval systems that build neural networks
(NN) using the images found in search results. In the content-based image retrieval system
developed by Haneen, Mohammed, and Faiez, low-level feature extraction approaches were used to
build the feature vector and to train the NN. The data stored in this database could be useful to
such systems.

The content-based image retrieval system was built with 160 bins. The bin count influences
both the system's accuracy and its overall performance. A system with many bins yields more
accurate results, but the process is slowed down. A system with far fewer bins returns images
rapidly, but the results are less precise. For color histogram distance, the appropriate number of
bins should therefore be examined by measuring both accuracy and performance.

In this work, the focus was only on improving performance, so security was not addressed.
Security is nevertheless an important issue for any type of computer system. One of the vulnerable
elements of this system is the database. An attacker who gains access to the database system could
make changes to it, add illicit content to it, modify the information, or even erase the data. It
is therefore apparent that security must be a required feature. E. Kijak, T.T. Do, L. Amsaleg and
T. Furon have shown in their papers that content-based image retrieval systems are susceptible to
security breaches; three studies were done to demonstrate how an attacker might lower the system's
recognition capability.

A content-based image retrieval system can be enhanced using cloud computing. A cloud server
is then required for the data to be saved, and this addition makes the system more exposed to
attack. Researchers have proposed many techniques to make such systems more secure. P. Saini,
S. Lain, H. Singh, and S. Soni have presented a new method for image retrieval systems in
conference papers they published in this area.


4. CONCLUSION 

The histogram intersection distance separates similar images from different images more
clearly than the other measures: the difference between the values for similar and for different
images is considerably larger than with the Euclidean distance. Thus, the histogram intersection
distance is the most appropriate distance measure for content-based image retrieval in this study.
It is also faster than the normalized cross-correlation technique when performance is considered.
The second finding of this research is that the performance of a content-based image retrieval
system can be improved by storing intermediate results in the database rather than working from
the complete images each time.